Finding Reverse Substrings in DNA

Authors

  • Kathleen M. Kaplan
  • John J. Kaplan
Abstract

Extended abstract. In DNA, a section of a string may be reversed. Because this occurs naturally, it is useful to be able to compare two strings and note where such reversed substrings occur. The K-algorithm provides a way to detect reverse substrings. It has a worst-case time complexity of O(m^(3/2) n), yet performs quickly in practice because of its parallel nature.
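To make the problem concrete, here is a minimal Python sketch of its simplest case: two strings of equal length that differ by at most one reversed section. This is not the K-algorithm, whose details the abstract does not give; it is a plain linear scan, and the function name `find_single_reversal` is hypothetical. It also assumes plain reversal, as the abstract describes, not reverse complementation.

```python
from typing import Optional, Tuple


def find_single_reversal(a: str, b: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) such that reversing a[start:end] yields b.

    Illustrative sketch only (not the K-algorithm). Returns None when the
    strings differ in length, are identical, or differ by something other
    than a single reversed section.
    """
    if len(a) != len(b):
        return None

    # Positions where the two strings disagree.
    mismatches = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if not mismatches:
        return None

    # The smallest window that covers every mismatch.
    start, end = mismatches[0], mismatches[-1] + 1

    # The window is a reversal only if it reads backwards in b.
    if a[start:end][::-1] == b[start:end]:
        return (start, end)
    return None


if __name__ == "__main__":
    # "CGT" at positions 2..4 is reversed to "TGC" in the second string.
    print(find_single_reversal("AACGTGG", "AATGCGG"))
```

The sketch reports the smallest window covering all mismatches, so a reversed section whose outer characters happen to match is reported without those outer characters. Handling several reversed sections, or strings of different lengths, is the harder problem the abstract addresses.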


Related articles

A cost-aggregating integer linear program for motif finding

In the motif finding problem one seeks a set of mutually similar substrings within a collection of biological sequences. This is an important and widely-studied problem, as such shared motifs in DNA often correspond to regulatory elements. We study a combinatorial framework where the goal is to find substrings of a given length such that the sum of their pairwise distances is minimized. We desc...


Efficient Enumeration of Phylogenetically Informative Substrings

We study the problem of enumerating substrings that are common amongst genomes that share evolutionary descent. For example, one might want to enumerate all identical (therefore conserved) substrings that are shared between all mammals and not found in non-mammals. Such collection of substrings may be used to identify conserved subsequences or to construct sets of identifying substrings for bra...


Efficient algorithms for the longest common subsequence in $k$-length substrings

Finding the longest common subsequence in k-length substrings (LCSk) is a recently proposed problem motivated by computational biology. This is a generalization of the well-known LCS problem in which matching symbols from two sequences A and B are replaced with matching non-overlapping substrings of length k from A and B. We propose several algorithms for LCSk, being non-trivial incarnations of...


An Efficient Algorithm for Finding Similar Short Substrings from Large Scale String Data

Finding similar substrings/substructures is a central task in analyzing huge amounts of string data such as genome sequences, web documents, log data, etc. In the sense of complexity theory, the existence of polynomial time algorithms for such problems is usually trivial since the number of substrings is bounded by the square of their lengths. However, straightforward algorithms do not work for...


Finding All Tandem Arrays in DNA Sequences

A tandem array is a substring of the form x^k, where x is any unspecified substring and k is at least two (when k is 2, x^2 is also called a tandem repeat or square). A non-extendable tandem array occurring in string S is a tandem array x^k that is not followed or preceded by another occurrence of x in S. The problem of this thesis is defined as follows: Given a string S of length n, find all...



Publication date: 2002